ABSTRACT



The present invention provides a system and method for 
improving the performance of general purpose processors by 
expanding at least one source operand to a width greater than 
the width of either the general purpose register or the data 
path width. In addition, the present invention provides 
several classes of instructions which cannot be performed 
efficiently if the operands are limited to the width and 
accessible number of general purpose registers. The present 
invention provides operands which are substantially larger 
than the data path width of the processor by using a general 
purpose register to specify a memory address from which at 
least more than one, but typically several data path widths of 
data can be read. The present invention also provides for the 
efficient usage of a multiplier array that is fully used for high 
precision arithmetic, but is only partly used for other, lower 
precision operations. 

48 Claims, 148 Drawing Sheets 
specifier = address + (size/2) + (width/2)

depth = 4
width = 16 bytes
size = depth x width = 64 bytes

address is aligned to size (64 bytes),
so low-order 6 bits are zero
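
The specifier arithmetic above can be sketched in a few lines of Python. This is an illustration only: the function names `encode_specifier` and `decode_specifier` are hypothetical, and the sketch assumes that size and width are powers of two, that width is smaller than size, and that the address is aligned to size, so the two half-values land in otherwise-zero low-order bits.

```python
def encode_specifier(address: int, size: int, width: int) -> int:
    """Pack address, size and width (all in bytes) into a single specifier."""
    for name, value in (("size", size), ("width", width)):
        if value < 2 or value & (value - 1):
            raise ValueError(f"{name} must be a power of two >= 2")
    if width >= size:
        raise ValueError("this sketch assumes width < size")
    if address % size:
        raise ValueError("address must be aligned to size")
    return address + size // 2 + width // 2


def decode_specifier(spec: int) -> tuple[int, int, int]:
    """Recover (address, size, width) by peeling off the two lowest set bits."""
    half_width = spec & -spec        # lowest set bit is width/2
    rest = spec & (spec - 1)         # clear that bit
    half_size = rest & -rest         # next lowest set bit is size/2
    address = rest & (rest - 1)      # what remains is the aligned address
    return address, 2 * half_size, 2 * half_width


if __name__ == "__main__":
    spec = encode_specifier(0x1000, size=64, width=16)
    print(hex(spec), decode_specifier(spec))
```

The same two steps, `x and (0-x)` to isolate the lowest set bit and `x and (x-1)` to clear it, recur in the instruction definitions on the following sheets.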
□ wmc.c contents
Operation codes

W.SWITCH.B    Wide switch big-endian
W.SWITCH.L    Wide switch little-endian

Selection

op    order
Wide switch

Format

W.op.order ra=rc,rd,rb

ra=woporder(rc,rd,rb)
Definition

def WideSwitch(op,rd,rc,rb,ra)
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 128)
    if c_{1..0} ≠ 0 then
        raise AccessDisallowedByVirtualAddress
    elseif c_{6..0} ≠ 0 then
        VirtAddr ← c and (c-1)
        w ← wsize ← (c and (0-c)) || 0^{1}
    else
        VirtAddr ← c
        w ← wsize ← 128
    endif
    msize ← 8*wsize
    lwsize ← log(wsize)
    case op of
        W.SWITCH.B:
            order ← B
        W.SWITCH.L:
            order ← L
    endcase
    m ← LoadMemory(c,VirtAddr,msize,order)
    db ← d || b
    for i ← 0 to 127
        j ← 0 || i_{lwsize-1..0}
        k ← m_{7*w+j} || m_{6*w+j} || m_{5*w+j} || m_{4*w+j} || m_{3*w+j} || m_{2*w+j} || m_{w+j} || m_{j}
        l ← i_{7..lwsize} || j_{lwsize-1..0}
        a_{i} ← db_{l}
    endfor
    RegWrite(ra, 128, a)
enddef
Operation codes

W.TRANSLATE.8.B    Wide translate bytes big-endian
                   Wide translate doublets big-endian
                   Wide translate quadlets big-endian
                   Wide translate octlets big-endian
                   Wide translate bytes little-endian
                   Wide translate doublets little-endian

Format

W.TRANSLATE.size.order rd=rc,rb

rd=wtranslatesizeorder(rc,rb)

31 24 23 18 17 12 11 6 5 2 1 0

sz ← log(size) - 3

Definition

def WideTranslate(op,gsize,rd,rc,rb)
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 128)
    lgsize ← log(gsize)
    if c_{lgsize-4..0} ≠ 0 then
        raise AccessDisallowedByVirtualAddress
    endif
    if c_{4..lgsize-3} ≠ 0 then
        wsize ← (c and (0-c)) || 0^{3}
        t ← c and (c-1)
    else
        wsize ← 128
        t ← c
    endif
    lwsize ← log(wsize)
    if t_{lwsize+4..lwsize-2} ≠ 0 then
        msize ← (t and (0-t)) || 0^{4}
        VirtAddr ← t and (t-1)
    else
        msize ← 256*wsize
        VirtAddr ← t
    endif
    case op of
        W.TRANSLATE.B:
            order ← B
        W.TRANSLATE.L:
            order ← L
    endcase
    m ← LoadMemory(c,VirtAddr,msize,order)
    vsize ← msize/wsize
    lvsize ← log(vsize)
    for i ← 0 to 128-gsize by gsize
        j ← ((order=B)^{lvsize} ^ (b_{lvsize-1+i..i})) * wsize + i_{lwsize-1..0}
        a_{gsize-1+i..i} ← m_{j+gsize-1..j}
    endfor
    RegWrite(rd, 128, a)
enddef
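
As a rough illustration of the translate loop above, the following Python sketch models only one configuration: byte groups (gsize = 8), a full-width table (wsize = 128, 256 rows) and little-endian ordering, so the `(order=B)` term contributes nothing. The function name `wide_translate_bytes` and the representation of the memory operand as a Python list of 128-bit rows are assumptions made for this example, not part of the definition.

```python
def wide_translate_bytes(table: list[int], b: int) -> int:
    """Translate each byte lane of the 128-bit value b through a 256-row table.

    table[v] models the 128-bit row at bit offset v*128 of the memory operand.
    The lane at bit offset i of the result is taken from the same lane of the
    row selected by the byte found in that lane of b.
    """
    if len(table) != 256:
        raise ValueError("this sketch assumes a 256-row table")
    result = 0
    for i in range(0, 128, 8):
        index = (b >> i) & 0xFF              # byte of b in lane i
        lane = (table[index] >> i) & 0xFF    # lane i of the selected row
        result |= lane << i
    return result


if __name__ == "__main__":
    replicate = sum(1 << (8 * k) for k in range(16))   # 0x0101...01
    invert = [(v ^ 0xFF) * replicate for v in range(256)]
    print(hex(wide_translate_bytes(invert, 0x0F)))
```

Because every lane reads its own column of the table, a single table can hold a different byte mapping for each of the sixteen lanes.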
Exceptions

Access disallowed by virtual address
Access disallowed by tag
Access disallowed by global TB
Access disallowed by local TB
Access detail required by tag
W.MUL.MAT.8.B       Wide multiply matrix signed byte big-endian
W.MUL.MAT.8.L       Wide multiply matrix signed byte little-endian
W.MUL.MAT.16.B      Wide multiply matrix signed doublet big-endian
W.MUL.MAT.16.L      Wide multiply matrix signed doublet little-endian
W.MUL.MAT.32.B      Wide multiply matrix signed quadlet big-endian
W.MUL.MAT.32.L      Wide multiply matrix signed quadlet little-endian
W.MUL.MAT.C.8.B     Wide multiply matrix signed complex byte big-endian
W.MUL.MAT.C.8.L     Wide multiply matrix signed complex byte little-endian
W.MUL.MAT.C.16.B    Wide multiply matrix signed complex doublet big-endian
W.MUL.MAT.C.16.L    Wide multiply matrix signed complex doublet little-endian
W.MUL.MAT.M.8.B     Wide multiply matrix mixed-signed byte big-endian
W.MUL.MAT.M.8.L     Wide multiply matrix mixed-signed byte little-endian
W.MUL.MAT.M.16.B    Wide multiply matrix mixed-signed doublet big-endian
W.MUL.MAT.M.16.L    Wide multiply matrix mixed-signed doublet little-endian
W.MUL.MAT.M.32.B    Wide multiply matrix mixed-signed quadlet big-endian
W.MUL.MAT.M.32.L    Wide multiply matrix mixed-signed quadlet little-endian
W.MUL.MAT.P.8.B     Wide multiply matrix polynomial byte big-endian
W.MUL.MAT.P.8.L     Wide multiply matrix polynomial byte little-endian
W.MUL.MAT.P.16.B    Wide multiply matrix polynomial doublet big-endian
W.MUL.MAT.P.16.L    Wide multiply matrix polynomial doublet little-endian
W.MUL.MAT.P.32.B    Wide multiply matrix polynomial quadlet big-endian
W.MUL.MAT.P.32.L    Wide multiply matrix polynomial quadlet little-endian
W.MUL.MAT.U.8.B     Wide multiply matrix unsigned byte big-endian
W.MUL.MAT.U.8.L     Wide multiply matrix unsigned byte little-endian
W.MUL.MAT.U.16.B    Wide multiply matrix unsigned doublet big-endian
W.MUL.MAT.U.16.L    Wide multiply matrix unsigned doublet little-endian
W.MUL.MAT.U.32.B    Wide multiply matrix unsigned quadlet big-endian
W.MUL.MAT.U.32.L    Wide multiply matrix unsigned quadlet little-endian

Format

| W.MINOR.order | rd | rc

FIG. 14C
Definition 1480

def mul(size,h,vs,v,i,ws,w,j) as
    mul ← ((vs&v_{size-1+i})^{h-size} || v_{size-1+i..i}) * ((ws&w_{size-1+j})^{h-size} || w_{size-1+j..j})
enddef

def c ← PolyMultiply(size,a,b) as
    p[0] ← 0^{2*size}
    for k ← 0 to size-1
        p[k+1] ← p[k] ^ a_{k} ? (0^{size-k} || b || 0^{k}) : 0^{2*size}
    endfor
    c ← p[size]
enddef

def WideMultiplyMatrix(major,op,gsize,rd,rc,rb)
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 128)
    lgsize ← log(gsize)
    if c_{lgsize-4..0} ≠ 0 then
        raise AccessDisallowedByVirtualAddress
    endif
    if c_{2..lgsize-3} ≠ 0 then
        wsize ← (c and (0-c)) || 0^{4}
        t ← c and (c-1)
    else
        wsize ← 64
        t ← c
    endif
    lwsize ← log(wsize)
    if t_{lwsize+6-lgsize..lwsize-3} ≠ 0 then
        msize ← (t and (0-t)) || 0^{4}
        VirtAddr ← t and (t-1)
    else
        msize ← 128*wsize/gsize
        VirtAddr ← t
    endif
    case major of
        W.MINOR.B:
            order ← B
        W.MINOR.L:
            order ← L
    endcase



FIG. 14D-1 
    case op of
        W.MUL.MAT.U.8, W.MUL.MAT.U.16, W.MUL.MAT.U.32, W.MUL.MAT.U.64:
            ms ← bs ← 0
        W.MUL.MAT.M.8, W.MUL.MAT.M.16, W.MUL.MAT.M.32, W.MUL.MAT.M.64:
            ms ← 0
            bs ← 1
        W.MUL.MAT.8, W.MUL.MAT.16, W.MUL.MAT.32, W.MUL.MAT.64,
        W.MUL.MAT.C.8, W.MUL.MAT.C.16, W.MUL.MAT.C.32, W.MUL.MAT.C.64:
            ms ← bs ← 1
        W.MUL.MAT.P.8, W.MUL.MAT.P.16, W.MUL.MAT.P.32, W.MUL.MAT.P.64:
    endcase
    m ← LoadMemory(c,VirtAddr,msize,order)
    h ← 2*gsize
    for i ← 0 to wsize-gsize by gsize
        for j ← 0 to vsize-gsize by gsize
            case op of
                W.MUL.MAT.P.8, W.MUL.MAT.P.16, W.MUL.MAT.P.32, W.MUL.MAT.P.64:
                    k ← i+wsize*j_{8..lgsize}
                    q[j+gsize] ← q[j] ^ PolyMultiply(gsize, m_{k+gsize-1..k}, b_{j+gsize-1..j})
                W.MUL.MAT.C.8, W.MUL.MAT.C.16, W.MUL.MAT.C.32, W.MUL.MAT.C.64:
                    if (~i) & j & gsize = 0 then
                        k ← i-(j&gsize)+wsize*j_{8..lgsize+1}
                        q[j+gsize] ← q[j] + mul(gsize,h,ms,m,k,bs,b,j)
                    else
                        k ← i+gsize+wsize*j_{8..lgsize+1}
                        q[j+gsize] ← q[j] - mul(gsize,h,ms,m,k,bs,b,j)
                    endif
                W.MUL.MAT.8, W.MUL.MAT.16, W.MUL.MAT.32, W.MUL.MAT.64,
                W.MUL.MAT.M.8, W.MUL.MAT.M.16, W.MUL.MAT.M.32, W.MUL.MAT.M.64,
                W.MUL.MAT.U.8, W.MUL.MAT.U.16, W.MUL.MAT.U.32, W.MUL.MAT.U.64:
                    q[j+gsize] ← q[j] + mul(gsize,h,ms,m,i+wsize*j_{8..lgsize},bs,b,j)
            endcase
        endfor
        a_{2*gsize-1+2*i..2*i} ← q[vsize]
    endfor
    a_{127..2*wsize} ← 0
    RegWrite(rd, 128, a)
enddef
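
The `mul` helper used throughout these definitions extracts a size-bit field from each operand, extends it to h bits (sign-extending when the corresponding sign flag is set) and multiplies. A minimal Python sketch follows; the helper name `field` is hypothetical, and truncating the product to h bits is an assumption of this sketch rather than something the pseudocode states.

```python
def field(value: int, size: int, pos: int, signed: bool) -> int:
    """Return the size-bit field of value starting at bit pos, as a Python int."""
    raw = (value >> pos) & ((1 << size) - 1)
    if signed and raw >> (size - 1):
        raw -= 1 << size                 # reinterpret the field as negative
    return raw


def mul(size: int, h: int, vs: bool, v: int, i: int,
        ws: bool, w: int, j: int) -> int:
    """Multiply field i of v by field j of w; result kept as an h-bit pattern."""
    product = field(v, size, i, vs) * field(w, size, j, ws)
    return product & ((1 << h) - 1)      # assumption: product is held in h bits


if __name__ == "__main__":
    # signed -1 times 2, and unsigned 255 times 2, in a 16-bit product
    print(hex(mul(8, 16, True, 0xFF, 0, True, 0x02, 0)))
    print(hex(mul(8, 16, False, 0xFF, 0, False, 0x02, 0)))
```

The mixed-signed (M) opcodes correspond to passing different sign flags for the two operands, as the `ms ← 0`, `bs ← 1` case above does.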
1510

Operation codes

W.MUL.MAT.X.B    Wide multiply matrix extract big-endian
W.MUL.MAT.X.L    Wide multiply matrix extract little-endian

Selection

class                      op             order
Multiply matrix extract    W.MUL.MAT.X    B L

Format

W.op.order ra=rc,rd,rb

ra=wop(rc,rd,rb)

 31        24 23  18 17  12 11   6 5    0
| W.op.order |  rd  |  rc  |  rb  |  ra  |
      8         6      6      6      6

FIG. 15A
1520

 31    24 23    16 15 14 13 12 11 10  9 8      0
| fsize  |  dpos  | x| s| n| m| l| rnd |  gssp  |
    8        8      1  1  1  1  1   2      9
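
A short Python sketch of how such a control word might be unpacked. It assumes the field order fsize, dpos, x, s, n, m, l, rnd, gssp from bit 31 down to bit 0, with widths 8, 8, 1, 1, 1, 1, 1, 2 and 9, which agrees with the bit positions read by the definitions that follow (bits 31..24, 23..16, 14, 13, 12, 11, 10..9 and 8..0). The names `ExtractControl` and `unpack_control` are hypothetical.

```python
from typing import NamedTuple


class ExtractControl(NamedTuple):
    fsize: int   # bits 31..24
    dpos: int    # bits 23..16
    x: int       # bit 15
    s: int       # bit 14 (signed)
    n: int       # bit 13
    m: int       # bit 12
    l: int       # bit 11
    rnd: int     # bits 10..9
    gssp: int    # bits 8..0


def unpack_control(word: int) -> ExtractControl:
    """Split the low-order 32 bits of a control register into named fields."""
    return ExtractControl(
        fsize=(word >> 24) & 0xFF,
        dpos=(word >> 16) & 0xFF,
        x=(word >> 15) & 1,
        s=(word >> 14) & 1,
        n=(word >> 13) & 1,
        m=(word >> 12) & 1,
        l=(word >> 11) & 1,
        rnd=(word >> 9) & 0b11,
        gssp=word & 0x1FF,
    )


if __name__ == "__main__":
    word = (16 << 24) | (4 << 16) | (1 << 14) | (1 << 11) | (2 << 9) | 480
    print(unpack_control(word))
```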
1530

[Figure: operands m[rc] (128*128/size, bits 1023..0), rd(128) and rb(32) feed multiply and extract stages; result ra(128)]

Wide multiply matrix extract doublets

1560

[Figure: operands rc (64*128/size, bits 511..0), rd(128) and rb(32); result ra(128)]

Wide multiply matrix extract complex doublets
Definition 1580

def mul(size,h,vs,v,i,ws,w,j) as
    mul ← ((vs&v_{size-1+i})^{h-size} || v_{size-1+i..i}) * ((ws&w_{size-1+j})^{h-size} || w_{size-1+j..j})
enddef

def WideMultiplyMatrixExtract(op,ra,rb,rc,rd)
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 128)
    case b_{8..0} of
        0..255:
            sgsize ← 128
        256..383:
            sgsize ← 64
        384..447:
            sgsize ← 32
        448..479:
            sgsize ← 16
        480..495:
            sgsize ← 8
        496..503:
            sgsize ← 4
        504..507:
            sgsize ← 2
        508..511:
            sgsize ← 1
    endcase
    l ← b_{11}
    m ← b_{12}
    n ← b_{13}
    signed ← b_{14}
    if c_{3..0} ≠ 0 then
        wsize ← (c and (0-c)) || 0^{4}
        t ← c and (c-1)
    else
        wsize ← 128
        t ← c
    endif
    if sgsize < 8 then
        gsize ← 8
    elseif sgsize > wsize/2 then
        gsize ← wsize/2
    else
FIG. 15E-1 
1580

        gsize ← sgsize
    endif
    lgsize ← log(gsize)
    lwsize ← log(wsize)
    if t_{lwsize+6-n-lgsize..lwsize-3} ≠ 0 then
        msize ← (t and (0-t)) || 0^{4}
        VirtAddr ← t and (t-1)
    else
        msize ← 64*(2-n)*wsize/gsize
        VirtAddr ← t
    endif
    vsize ← (1+n)*msize*gsize/wsize
    mm ← LoadMemory(c,VirtAddr,msize,order)
    lmsize ← log(msize)
    if (VirtAddr_{lmsize-4..0} ≠ 0) then
        raise AccessDisallowedByVirtualAddress
    endif
    case op of
        W.MUL.MAT.X.B:
            order ← B
        W.MUL.MAT.X.L:
            order ← L
    endcase
    ms ← signed
    ds ← signed ^ m
    as ← signed or m
    spos ← (b_{8..0}) and (2*gsize-1)
    dpos ← (0 || b_{23..16}) and (gsize-1)
    r ← spos
    sfsize ← (0 || b_{31..24}) and (gsize-1)
    tfsize ← (sfsize = 0) or ((sfsize+dpos) > gsize) ? gsize-dpos : sfsize
    fsize ← (tfsize + spos > h) ? h - spos : tfsize
    if (b_{10..9} = Z) & ~signed then
        rnd ← F
    else
        rnd ← b_{10..9}
    endif
1580

    for i ← 0 to wsize-gsize by gsize
        q[0] ← 0^{2*gsize+7-lgsize}
        for j ← 0 to vsize-gsize by gsize
            if n then
                if (~i) & j & gsize = 0 then
                    k ← i-(j&gsize)+wsize*j_{8..lgsize+1}
                    q[j+gsize] ← q[j] + mul(gsize,h,ms,mm,k,ds,d,j)
                else
                    k ← i+gsize+wsize*j_{8..lgsize+1}
                    q[j+gsize] ← q[j] - mul(gsize,h,ms,mm,k,ds,d,j)
                endif
            else
                q[j+gsize] ← q[j] + mul(gsize,h,ms,mm,i+j*wsize/gsize,ds,d,j)
            endif
        endfor
        p ← q[128]
        case rnd of
            none, N:
                s ← 0^{h-r} || ~p_{r} || p_{r}^{r-1}
            F:
                s ← 0^{h}
            C:
                s ← 0^{h-r} || 1^{r}
        endcase
        v ← ((ds & p_{h-1}) || p) + (0 || s)
        if (v_{h..r+fsize} = (as & v_{r+fsize-1})^{h+1-r-fsize}) or not l then
            w ← (as & v_{r+fsize-1})^{gsize-fsize-dpos} || v_{fsize-1+r..r} || 0^{dpos}
        else
            w ← (as ? (v_{h} || ~v_{h}^{gsize-dpos-1}) : 1^{gsize-dpos}) || 0^{dpos}
        endif
        a_{gsize-1+i..i} ← w
    endfor
    a_{127..wsize} ← 0
    RegWrite(ra, 128, a)
enddef
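
The case table at the top of this definition maps the 9-bit gssp field to a group size. A small Python sketch of that lookup, with the hypothetical name `decode_sgsize`:

```python
def decode_sgsize(gssp: int) -> int:
    """Map a 9-bit gssp value to sgsize using the ranges in the case table."""
    if not 0 <= gssp <= 511:
        raise ValueError("gssp is a 9-bit field")
    # (exclusive upper bound of the range, group size in bits)
    ranges = (
        (256, 128), (384, 64), (448, 32), (480, 16),
        (496, 8), (504, 4), (508, 2), (512, 1),
    )
    for upper, sgsize in ranges:
        if gssp < upper:
            return sgsize
    raise AssertionError("unreachable for a 9-bit value")


if __name__ == "__main__":
    for value in (0, 256, 384, 480, 511):
        print(value, decode_sgsize(value))
```

Each range is half as long as the previous one except the last, which has the same length as the one before it.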
1610

Operation codes

W.MUL.MAT.X.I.8.B       Wide multiply matrix extract immediate signed byte big-endian
W.MUL.MAT.X.I.8.L       Wide multiply matrix extract immediate signed byte little-endian
W.MUL.MAT.X.I.16.B      Wide multiply matrix extract immediate signed doublet big-endian
W.MUL.MAT.X.I.16.L      Wide multiply matrix extract immediate signed doublet little-endian
W.MUL.MAT.X.I.32.B      Wide multiply matrix extract immediate signed quadlet big-endian
W.MUL.MAT.X.I.32.L      Wide multiply matrix extract immediate signed quadlet little-endian
W.MUL.MAT.X.I.64.B      Wide multiply matrix extract immediate signed octlets big-endian
W.MUL.MAT.X.I.64.L      Wide multiply matrix extract immediate signed octlets little-endian
W.MUL.MAT.X.I.C.8.B     Wide multiply matrix extract immediate complex bytes big-endian
W.MUL.MAT.X.I.C.8.L     Wide multiply matrix extract immediate complex bytes little-endian
W.MUL.MAT.X.I.C.16.B    Wide multiply matrix extract immediate complex doublets big-endian
W.MUL.MAT.X.I.C.16.L    Wide multiply matrix extract immediate complex doublets little-endian
W.MUL.MAT.X.I.C.32.B    Wide multiply matrix extract immediate complex quadlets big-endian
W.MUL.MAT.X.I.C.32.L    Wide multiply matrix extract immediate complex quadlets little-endian

Selection

class                    op               type    size          order
wide multiply            W.MUL.MAT.X.I    NONE    8 16 32 64    L B
extract immediate                         C       8 16 32       L B

Format

W.op.tsize.order rd=rc,rb,i
rd=woptsizeorder(rc,rb,i)

 31        24 23  18 17  12 11   6  5  4  3 2  0
| W.op.order |  rd  |  rc  |  rb  | t | sz | sh |
      8         6      6      6     1   2    3

sz ← log(size) - 3
assert size+3 ≥ i ≥ size-4
sh ← i - size

FIG. 16A
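
The sz and sh fields can be illustrated with a short Python sketch. It assumes that sh holds i - size as a 3-bit two's-complement value, which is what the range in the assert suggests (size-4 through size+3), and that the decoder recovers the shift as size plus the sign-extended sh. The function names are hypothetical.

```python
def encode_immediate(size: int, i: int) -> tuple[int, int]:
    """Return (sz, sh) for an element size in bits and an immediate i."""
    if size not in (8, 16, 32, 64):
        raise ValueError("size must be 8, 16, 32 or 64")
    if not size - 4 <= i <= size + 3:
        raise ValueError("i must satisfy size-4 <= i <= size+3")
    sz = size.bit_length() - 1 - 3       # log2(size) - 3
    sh = (i - size) & 0b111              # 3-bit two's complement (assumption)
    return sz, sh


def decode_shift(size: int, sh: int) -> int:
    """Recover i from size and the 3-bit sh field by sign-extending sh."""
    signed_sh = sh - 8 if sh & 0b100 else sh
    return size + signed_sh


if __name__ == "__main__":
    sz, sh = encode_immediate(16, 18)
    print(sz, sh, decode_shift(16, sh))
```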
1630

[Figure: operand m[rc] (128*128/size, bits 1023..0) feeds multiply and extract stages; result rd(128)]

Wide multiply matrix extract immediate doublets
1680

Definition

def mul(size,h,vs,v,i,ws,w,j) as
    mul ← ((vs&v_{size-1+i})^{h-size} || v_{size-1+i..i}) * ((ws&w_{size-1+j})^{h-size} || w_{size-1+j..j})
enddef

def WideMultiplyMatrixExtractImmediate(op,type,gsize,rd,rc,rb,sh)
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 128)
    lgsize ← log(gsize)
    case type of
        NONE:
            if c_{lgsize-4..0} ≠ 0 then
                raise AccessDisallowedByVirtualAddress
            endif
            if c_{3..lgsize-3} ≠ 0 then
                wsize ← (c and (0-c)) || 0^{4}
                t ← c and (c-1)
            else
                wsize ← 128
                t ← c
            endif
            lwsize ← log(wsize)
            if t_{lwsize+6-lgsize..lwsize-3} ≠ 0 then
                msize ← (t and (0-t)) || 0^{4}
                VirtAddr ← t and (t-1)
            else
                msize ← 128*wsize/gsize
                VirtAddr ← t
            endif
        C:
            if c_{lgsize-4..0} ≠ 0 then
                raise AccessDisallowedByVirtualAddress
            endif
            if c_{3..lgsize-3} ≠ 0 then
                wsize ← (c and (0-c)) || 0^{4}
                t ← c and (c-1)
            else
                wsize ← 128
                t ← c
            endif
            lwsize ← log(wsize)
            if t_{lwsize+5-lgsize..lwsize-3} ≠ 0 then
                msize ← (t and (0-t)) || 0^{4}
FIG. 16D-1 
1680

                VirtAddr ← t and (t-1)
            else
                msize ← 64*wsize/gsize
                VirtAddr ← t
            endif
            vsize ← 2*msize*gsize/wsize
    endcase
    case op of
        W.MUL.MAT.X.I.B:
            order ← B
        W.MUL.MAT.X.I.L:
            order ← L
    endcase
    as ← ms ← bs ← 1
    m ← LoadMemory(c,VirtAddr,msize,order)
    h ← (2*gsize) + 7 - lgsize - (ms and bs)
    r ← gsize + (sh_{2}^{5} || sh)
    for i ← 0 to wsize-gsize by gsize
        q[0] ← 0^{2*gsize+7-lgsize}
        for j ← 0 to vsize-gsize by gsize
            case type of
                NONE:
                    q[j+gsize] ← q[j] + mul(gsize,h,ms,m,i+wsize*j_{8..lgsize},bs,b,j)
                C:
                    if (~i) & j & gsize = 0 then
                        k ← i-(j&gsize)+wsize*j_{8..lgsize+1}
                        q[j+gsize] ← q[j] + mul(gsize,h,ms,m,k,bs,b,j)
                    else
                        k ← i+gsize+wsize*j_{8..lgsize+1}
                        q[j+gsize] ← q[j] - mul(gsize,h,ms,m,k,bs,b,j)
                    endif
            endcase
        endfor
        p ← q[vsize]
        s ← 0^{h-r} || ~p_{r} || p_{r}^{r-1}
        v ← ((as & p_{h-1}) || p) + (0 || s)
        if (v_{h..r+gsize} = (as & v_{r+gsize-1})^{h+1-r-gsize}) then
            a_{gsize-1+i..i} ← v_{gsize-1+r..r}
        else
            a_{gsize-1+i..i} ← as ? (v_{h} || ~v_{h}^{gsize-1}) : 1^{gsize}
        endif
    endfor
    a_{127..wsize} ← 0
    RegWrite(rd, 128, a)
enddef

FIG. 16D-2
1710

Operation codes

W.MUL.MAT.C.F.16.B    Wide multiply matrix complex floating-point half big-endian
W.MUL.MAT.C.F.16.L    Wide multiply matrix complex floating-point half little-endian
W.MUL.MAT.C.F.32.B    Wide multiply matrix complex floating-point single big-endian
W.MUL.MAT.C.F.32.L    Wide multiply matrix complex floating-point single little-endian
W.MUL.MAT.F.16.B      Wide multiply matrix floating-point half big-endian
W.MUL.MAT.F.16.L      Wide multiply matrix floating-point half little-endian
W.MUL.MAT.F.32.B      Wide multiply matrix floating-point single big-endian
W.MUL.MAT.F.32.L      Wide multiply matrix floating-point single little-endian
W.MUL.MAT.F.64.B      Wide multiply matrix floating-point double big-endian
W.MUL.MAT.F.64.L      Wide multiply matrix floating-point double little-endian

Selection

class                   op           type    prec        order
wide multiply matrix    W.MUL.MAT    F       16 32 64    L B
                                     C.F     16 32       L B

Format

W.op.prec.order rd=rc,rb

rd=wopprecorder(rc,rb)

 31           24 23  18 17  12 11   6 5    2 1  0
| W.MINOR.order |  rd  |  rc  |  rb  | W.op | pr |
       8           6      6      6      4     2

pr ← log(prec) - 3
1730

[Figure: operands m[rc] (bits 1023..0) and rb(128)]

1760

[Figure: result rd(128)]

Wide multiply matrix complex floating-point half
Definition

def mul(size,v,i,w,j) as
    mul ← fmul(F(size,v_{size-1+i..i}),F(size,w_{size-1+j..j}))
enddef

def WideMultiplyMatrixFloatingPoint(major,op,gsize,rd,rc,rb)
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 128)
    lgsize ← log(gsize)
    switch op of
        W.MUL.MAT.F.16, W.MUL.MAT.F.32, W.MUL.MAT.F.64:
            if c_{lgsize-4..0} ≠ 0 then
                raise AccessDisallowedByVirtualAddress
            endif
            if c_{3..lgsize-3} ≠ 0 then
                wsize ← (c and (0-c)) || 0^{4}
                t ← c and (c-1)
            else
                wsize ← 128
                t ← c
            endif
            lwsize ← log(wsize)
            if t_{lwsize+6-lgsize..lwsize-3} ≠ 0 then
                msize ← (t and (0-t)) || 0^{4}
                VirtAddr ← t and (t-1)
            else
                msize ← 128*wsize/gsize
                VirtAddr ← t
            endif
            vsize ← msize*gsize/wsize
        W.MUL.MAT.C.F.16, W.MUL.MAT.C.F.32, W.MUL.MAT.C.F.64:
            if c_{lgsize-4..0} ≠ 0 then
                raise AccessDisallowedByVirtualAddress
            endif
            if c_{3..lgsize-3} ≠ 0 then
                wsize ← (c and (0-c)) || 0^{4}
                t ← c and (c-1)
            else
                wsize ← 128
                t ← c
            endif
            lwsize ← log(wsize)
            if t_{lwsize+5-lgsize..lwsize-3} ≠ 0 then

FIG. 17D-1 



Case 2:05-cv-00505-TJW Document 68 Filed §2/08/2008 Page 46 of 87 
U.S. Patent Apr. 20, 2004 Sheet 44 of 148 US 6,725,356 B2 



1780

                msize ← (t and (0-t)) || 0^{4}
                VirtAddr ← t and (t-1)
            else
                msize ← 64*wsize/gsize
                VirtAddr ← t
            endif
            vsize ← 2*msize*gsize/wsize
    endcase
    case major of
        W.MINOR.B:
            order ← B
        W.MINOR.L:
            order ← L
    endcase
    m ← LoadMemory(c,VirtAddr,msize,order)
    for i ← 0 to wsize-gsize by gsize
        q[0].t ← NULL
        for j ← 0 to vsize-gsize by gsize
            case op of
                W.MUL.MAT.F.16, W.MUL.MAT.F.32, W.MUL.MAT.F.64:
                    q[j+gsize] ← fadd(q[j], mul(gsize,m,i+wsize*j_{8..lgsize},b,j))
                W.MUL.MAT.C.F.16, W.MUL.MAT.C.F.32, W.MUL.MAT.C.F.64:
                    if (~i) & j & gsize = 0 then
                        k ← i-(j&gsize)+wsize*j_{8..lgsize+1}
                        q[j+gsize] ← fadd(q[j], mul(gsize,m,k,b,j))
                    else
                        k ← i+gsize+wsize*j_{8..lgsize+1}
                        q[j+gsize] ← fsub(q[j], mul(gsize,m,k,b,j))
                    endif
            endcase
        endfor
        a_{gsize-1+i..i} ← q[vsize]
    endfor
    a_{127..wsize} ← 0
    RegWrite(rd, 128, a)
enddef
1810

Operation codes

W.MUL.MAT.G.8.B    Wide multiply matrix Galois bytes big-endian
W.MUL.MAT.G.8.L    Wide multiply matrix Galois bytes little-endian

Selection

class                     op             size    order
Multiply matrix Galois    W.MUL.MAT.G    8       B L

Format

W.op.order ra=rc,rd,rb
ra=woporder(rc,rd,rb)

 31        24 23  18 17  12 11   6 5    0
| W.op.order |  rd  |  rc  |  rb  |  ra  |
      8         6      6      6      6
Definition 1860

def c ← PolyMultiply(size,a,b) as
    p[0] ← 0^{2*size}
    for k ← 0 to size-1
        p[k+1] ← p[k] ^ a_{k} ? (0^{size-k} || b || 0^{k}) : 0^{2*size}
    endfor
    c ← p[size]
enddef

def c ← PolyResidue(size,a,b) as
    p[0] ← a
    for k ← size-1 to 0 by -1
        p[k-1] ← p[k] ^ p[0]_{size+k} ? (0^{size-k} || 1^{1} || b || 0^{k}) : 0^{2*size}
    endfor
    c ← p[size]_{size-1..0}
enddef
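
The two helpers above are a carry-less (polynomial) multiply over GF(2) and a reduction modulo a polynomial whose leading term is implicit. The Python sketch below is not a line-by-line transcription of the figure: it uses the conventional shift-and-xor formulation, and the name `gf_multiply` is hypothetical. The low-order coefficients 0x1B used in the example are those of the familiar polynomial x^8 + x^4 + x^3 + x + 1, chosen only so the results can be compared against published values.

```python
def poly_multiply(size: int, a: int, b: int) -> int:
    """Carry-less product of two size-bit polynomials over GF(2)."""
    p = 0
    for k in range(size):
        if (a >> k) & 1:
            p ^= b << k                  # add (xor) b * x^k
    return p


def poly_residue(size: int, a: int, b: int) -> int:
    """Reduce a (up to 2*size bits) modulo x^size + b."""
    modulus = (1 << size) | b            # make the implicit leading term explicit
    for k in range(size - 1, -1, -1):
        if (a >> (size + k)) & 1:
            a ^= modulus << k            # cancel the term x^(size+k)
    return a & ((1 << size) - 1)


def gf_multiply(size: int, a: int, b: int, poly: int) -> int:
    """Multiply in GF(2^size): polynomial multiply followed by reduction."""
    return poly_residue(size, poly_multiply(size, a, b), poly)


if __name__ == "__main__":
    print(hex(gf_multiply(8, 0x57, 0x83, 0x1B)))
```

In the matrix form that follows, the partial products are accumulated with xor first and the reduction is applied once per result element, which is equivalent because reduction is linear over GF(2).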

def WideMultiplyMatrixGalois(op,gsize,rd,rc,rb,ra)
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 128)
    lgsize ← log(gsize)
    if c_{lgsize-4..0} ≠ 0 then
        raise AccessDisallowedByVirtualAddress
    endif
    if c_{3..lgsize-3} ≠ 0 then
        wsize ← (c and (0-c)) || 0^{4}
        t ← c and (c-1)
    else
        wsize ← 128
        t ← c
    endif
    lwsize ← log(wsize)
    if t_{lwsize+6-lgsize..lwsize-3} ≠ 0 then
        msize ← (t and (0-t)) || 0^{4}
        VirtAddr ← t and (t-1)
    else
        msize ← 128*wsize/gsize
        VirtAddr ← t
    endif
    case op of
        W.MUL.MAT.G.8.B:
            order ← B
        W.MUL.MAT.G.8.L:
            order ← L
    endcase

1860

    m ← LoadMemory(c,VirtAddr,msize,order)
    for i ← 0 to wsize-gsize by gsize
        for j ← 0 to vsize-gsize by gsize
            k ← i+wsize*j_{8..lgsize}
            q[j+gsize] ← q[j] ^ PolyMultiply(gsize, m_{k+gsize-1..k}, d_{j+gsize-1..j})
        endfor
        a_{gsize-1+i..i} ← PolyResidue(gsize, q[vsize], b_{gsize-1..0})
    endfor
    a_{127..wsize} ← 0
    RegWrite(ra, 128, a)
enddef

FIG. 18C-2
Exceptions

Access disallowed by virtual address 
Access disallowed by tag 
Access disallowed by global TB 
Access disallowed by local TB 
Access detail required by tag 
Access detail required by local TB 
Access detail required by global TB 
Local TB miss 
Global TB miss 
FIG. 18D

1910

Operation codes

E.MULADD.X    Ensemble multiply add extract
E.CON.X       Ensemble convolve extract

Format

E.op rd@rc,rb,ra
rd=gop(rd,rc,rb,ra)

 31  24 23  18 17  12 11   6 5    0
| E.op |  rd  |  rc  |  rb  |  ra  |
   8      6      6      6      6

FIG. 19A
1910

Figures 19B and 20B have blank fields; they should be:

| fsize | dpos | x | s | n | m | l | rnd | gssp |

FIG. 19B
1930

[Figure: operands rc(128) and rb(128) feed multiply and extract stages; result rd(128)]

Ensemble multiply add extract doublets

FIG. 19C

[Figure: operand rb(128); result rd(128)]

Ensemble complex multiply add extract doublets

The ensemble-multiply-add-extract instructions (E.MULADD.X), when
the x bit is set, multiply the low-order 64 bits of each of the rc and rb
registers and produce extended (double-size) results.

FIG. 19D

Ensemble convolve extract doublets

FIG. 19E
Definition 1990

def mul(size,h,vs,v,i,ws,w,j) as
    mul ← ((vs&v_{size-1+i})^{h-size} || v_{size-1+i..i}) * ((ws&w_{size-1+j})^{h-size} || w_{size-1+j..j})
enddef

def EnsembleExtractInplace(op,ra,rb,rc,rd) as
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 128)
    b ← RegRead(rb, 128)
    case b_{8..0} of
        0..255:
            sgsize ← 128
        256..383:
            sgsize ← 64
        384..447:
            sgsize ← 32
        448..479:
            sgsize ← 16
        480..495:
            sgsize ← 8
        496..503:
            sgsize ← 4
        504..507:
            sgsize ← 2
        508..511:
            sgsize ← 1
    endcase
    l ← a_{11}
    m ← a_{12}
    signed ← a_{14}
    x ← a_{15}
    case op of
        E.CON.X:
            if (sgsize < 8) then
                gsize ← 8
            elseif (sgsize*(n+1)*(x+1) > 128) then
                gsize ← 128/(n+1)/(x+1)
            else
                gsize ← sgsize
            endif
            lgsize ← log(gsize)
            wsize ← 128/(x+1)

FIG. 19G-1 
            vsize ← 128
            ds ← cs ← signed
            bs ← signed ^ m
            zs ← signed or m or n
            zsize ← gsize*(x+1)
            h ← (2*gsize) + log(vsize) - lgsize
            spos ← (a_{8..0}) and (2*gsize-1)
        E.MULADD.X:
            if (sgsize < 8) then
                gsize ← 8
            elseif (sgsize*(n+1)*(x+1) > 128) then
                gsize ← 128/(n+1)/(x+1)
            else
                gsize ← sgsize
            endif
            ds ← signed
            cs ← signed ^ m
            zs ← signed or m or n
            zsize ← gsize*(x+1)
            h ← (2*gsize) + n
            spos ← (a_{8..0}) and (2*gsize-1)
    endcase
    dpos ← (0 || a_{23..16}) and (zsize-1)
    r ← spos
    sfsize ← (0 || a_{31..24}) and (zsize-1)
    tfsize ← (sfsize = 0) or ((sfsize+dpos) > zsize) ? zsize-dpos : sfsize
    fsize ← (tfsize + spos > h) ? h - spos : tfsize
    if (b_{10..9} = Z) and not as then
        rnd ← F
    else
        rnd ← b_{10..9}
    endif




FIG. 19G-2 
1990

    for k ← 0 to wsize-zsize by zsize
        i ← k*gsize/zsize
        case op of
            E.CON.X:
                for j ← 0 to vsize-gsize by gsize
                    if n then
                        if (~i) & j & gsize = 0 then
                            q[j+gsize] ← q[j] + mul(gsize,h,ms,m,i+…
                        else
                            q[j+gsize] ← q[j] - mul(gsize,h,ms,m,i+128-j+2*gsize,bs,b,j)
                        endif
                    else
                        q[j+gsize] ← q[j] + mul(gsize,h,ms,m,i+128-j,bs,b,j)
                    endif
                endfor
                p ← q[vsize]
            E.MULADD.X:
                di ← ((ds and d_{k+zsize-1})^{h-zsize-r} || (d_{k+zsize-1..k}) || 0^{r})
                if n then
                    if (i and gsize) = 0 then
                        p ← mul(gsize,h,ds,d,i,cs,c,i) - mul(gsize,h,ds,d,i+gsize,cs,c,i+gsize) + di
                    else
                        p ← mul(gsize,h,ds,d,i,cs,c,i+gsize) + …
                    endif
                else
                    p ← mul(gsize,h,ds,d,i,cs,c,i) + di
                endif
        endcase



FIG. 19G-3 
1990

        case rnd of
            N:
                s ← 0^{h-r} || ~p_{r} || p_{r}^{r-1}
            F:
                s ← 0^{h}
            C:
                s ← 0^{h-r} || 1^{r}
        endcase
        v ← ((as & p_{h-1}) || p) + (0 || s)
        if (v_{h..r+fsize} = (as & v_{r+fsize-1})^{h+1-r-fsize}) or not (l and (op = EXTRACT)) then
            w ← (as & v_{r+fsize-1})^{zsize-fsize-dpos} || v_{fsize-1+r..r} || 0^{dpos}
        else
            w ← (as ? (v_{h} || ~v_{h}^{zsize-dpos-1}) : 1^{zsize-dpos}) || 0^{dpos}
        endif
        z_{zsize-1+k..k} ← w
    endfor
    RegWrite(rd, 128, z)
enddef

FIG. 19G-4
2010

Operation codes

E.MUL.X         Ensemble multiply extract
E.EXTRACT       Ensemble extract
E.SCAL.ADD.X    Ensemble scale add extract

Format

E.op ra=rd,rc,rb
ra=eop(rd,rc,rb)

 31  24 23  18 17  12 11   6 5    0
| E.op |  rd  |  rc  |  rb  |  ra  |
   8      6      6      6      6

FIG. 20A
Figures 19B and 20B have blank fields; they should be:

| fsize | dpos |

FIG. 20B

FIG. 20C
2030

[Figure: operands rd(128) and rc(128) feed multiply and extract stages; result ra(128)]

Ensemble complex multiply extract doublets

The ensemble-multiply-extract instructions (E.MUL.X), when
the x bit is set, multiply the low-order 64 bits of each of the rc and rb
registers and produce extended (double-size) results.

FIG. 20D
2040

[Figure: operands rd(128), rc(128) and rb(128) (fields at bits 95..80 and 79..64) feed multiply and extract stages; result ra(128)]

Ensemble scale add extract doublets

FIG. 20E
2050

[Figure: operand rd(128)]

Ensemble complex scale add extract doublets

The ensemble-scale-add-extract instructions (E.SCAL.ADD.X), when the x bit
is set, multiply the low-order 64 bits of each of the rd and rc registers by the
rb register fields and produce extended (double-size) results.
2060

[Figure: source rd||rc; extracted field of fsize bits; result ra]

Ensemble extract

FIG. 20G
2070

[Figure: sources rd and rc; fields fsize, spos, gsize and dpos; result ra]

Ensemble merge extract

FIG. 20H
2080

[Figure: source rd; fields fsize, spos, gsize and dpos; result ra]

Ensemble expand extract

FIG. 20I
Definition 2090

def mul(size,h,vs,v,i,ws,w,j) as
    mul ← ((vs&v_{size-1+i})^{h-size} || v_{size-1+i..i}) * ((ws&w_{size-1+j})^{h-size} || w_{size-1+j..j})
enddef

def EnsembleExtract(op,ra,rb,rc,rd) as
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 128)
    b ← RegRead(rb, 128)
    case b_{8..0} of
        0..255:
            sgsize ← 128
        256..383:
            sgsize ← 64
        384..447:
            sgsize ← 32
        448..479:
            sgsize ← 16
        480..495:
            sgsize ← 8
        496..503:
            sgsize ← 4
        504..507:
            sgsize ← 2
        508..511:
            sgsize ← 1
    endcase
    m ← b_{12}
    n ← b_{13}
    signed ← b_{14}
    case op of
        E.EXTRACT:
            gsize ← sgsize*2^{(2-(m or x))}
            zsize ← sgsize
            h ← gsize
            as ← signed
            spos ← (b_{8..0}) and (gsize-1)



FIG. 20J-1 
        E.SCAL.ADD.X:
            if (sgsize < 8) then
                gsize ← 8
            elseif (sgsize*(n+1) > 32) then
                gsize ← 32/(n+1)
            else
                gsize ← sgsize
            endif
            ds ← cs ← signed
            bs ← signed ^ m
            as ← signed or m or n
            zsize ← gsize*(x+1)
            h ← (2*gsize) + 1 + n
            spos ← (b_{8..0}) and (2*gsize-1)
        E.MUL.X:
            if (sgsize < 8) then
                gsize ← 8
            elseif (sgsize*(n+1)*(x+1) > 128) then
                gsize ← 128/(n+1)/(x+1)
            else
                gsize ← sgsize
            endif
            ds ← signed
            cs ← signed ^ m
            as ← signed or m or n
            zsize ← gsize*(x+1)
            h ← (2*gsize) + n
            spos ← (b_{8..0}) and (2*gsize-1)
    endcase
    dpos ← (0 || b_{23..16}) and (zsize-1)
    r ← spos
    sfsize ← (0 || b_{31..24}) and (zsize-1)
    tfsize ← (sfsize = 0) or ((sfsize+dpos) > zsize) ? zsize-dpos : sfsize
    fsize ← (tfsize + spos > h) ? h - spos : tfsize
    if (b_{10..9} = Z) and not as then
        rnd ← F
    else
        rnd ← b_{10..9}
    endif

FIG. 20J-2 
2090

    for j ← 0 to 128-zsize by zsize
        i ← j*gsize/zsize
        case op of
            E.EXTRACT:
                if m or x then
                    p ← d_{gsize+i-1..i}
                else
                    p ← (d || c)_{gsize+i-1..i}
                endif
            E.MUL.X:
                if n then
                    if (i and gsize) = 0 then
                        p ← mul(gsize,h,ds,d,i,cs,c,i) - mul(gsize,h,ds,d,i+gsize,cs,c,i+gsize)
                    else
                        p ← mul(gsize,h,ds,d,i,cs,c,i+gsize) + mul(gsize,h,ds,d,i,cs,c,i+gsize)
                    endif
                else
                    p ← mul(gsize,h,ds,d,i,cs,c,i)
                endif
            E.SCAL.ADD.X:
                if n then
                    if (i and gsize) = 0 then
                        p ← mul(gsize,h,ds,d,i,bs,b,64+2*gsize)
                            + mul(gsize,h,cs,c,i,bs,b,64)
                            - mul(gsize,h,ds,d,i+gsize,bs,b,64+3*gsize)
                            - mul(gsize,h,cs,c,i+gsize,bs,b,64+gsize)
                    else
                        p ← mul(gsize,h,ds,d,i,bs,b,64+3*gsize)
                            + mul(gsize,h,cs,c,i,bs,b,64+gsize)
                            + mul(gsize,h,ds,d,i+gsize,bs,b,64+2*gsize)
                            + mul(gsize,h,cs,c,i+gsize,bs,b,64)
                    endif
                else
                    p ← mul(gsize,h,ds,d,i,bs,b,64+gsize) + mul(gsize,h,cs,c,i,bs,b,64)
                endif
        endcase



FIG. 20J-3 
2090

        case rnd of
            N:
                s ← 0^{h-r} || ~p_{r} || p_{r}^{r-1}
            Z:
                s ← 0^{h-r} || p_{h-1}^{r}
            F:
                s ← 0^{h}
            C:
                s ← 0^{h-r} || 1^{r}
        endcase
        v ← ((as & p_{h-1}) || p) + (0 || s)
        if (v_{h..r+fsize} = (as & v_{r+fsize-1})^{h+1-r-fsize}) or not (l and (op = E.EXTRACT)) then
            w ← (as & v_{r+fsize-1})^{zsize-fsize-dpos} || v_{fsize-1+r..r} || 0^{dpos}
        else
            w ← (as ? (v_{h} || ~v_{h}^{zsize-dpos-1}) : 1^{zsize-dpos}) || 0^{dpos}
        endif
        if m and (op = E.EXTRACT) then
            z_{zsize-1+j..j} ← c_{zsize-1+j..dpos+fsize+j} || w_{dpos+fsize-1..dpos} || c_{dpos-1+j..j}
        else
            z_{zsize-1+j..j} ← w
        endif
    endfor
    RegWrite(ra, 128, z)
enddef



FIG. 20J-4 
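
The field-size computation in the definition above (sfsize, then tfsize, then fsize) clamps the requested field so that it fits both the destination group and the h-bit intermediate result. A minimal Python sketch, with the hypothetical name `field_sizes` and the inputs passed in directly rather than decoded from a register:

```python
def field_sizes(sfsize: int, dpos: int, spos: int, zsize: int, h: int) -> tuple[int, int]:
    """Return (tfsize, fsize) following the two conditional expressions above."""
    # A requested size of zero, or one that would run past the end of the
    # destination group, is replaced by the space remaining above dpos.
    if sfsize == 0 or sfsize + dpos > zsize:
        tfsize = zsize - dpos
    else:
        tfsize = sfsize
    # The field also cannot extend past bit h of the intermediate result.
    fsize = h - spos if tfsize + spos > h else tfsize
    return tfsize, fsize


if __name__ == "__main__":
    print(field_sizes(0, 0, 0, 16, 32))     # default: whole group
    print(field_sizes(12, 8, 0, 16, 32))    # clipped by the group width
    print(field_sizes(8, 0, 28, 16, 32))    # clipped by the product width
```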
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2110 



Gateway with pointers to code and data spaces

FIG. 21A
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2130 



Typical dynamic-linked, inter-gateway calling sequence:

caller:
    caller    A.ADDI      sp@-size      // allocate caller stack frame
              S.I.64.A    lp,sp,off
              S.I.64.A    dp,sp,off
              ...
              L.I.64.A    lp=dp,off     // load lp
              L.I.64.A    dp=dp,off     // load dp
              B.GATE
              L.I.64.A    dp=sp,off
              ...(code using dp)
              L.I.64.A    lp=sp,off     // restore original lp register
              A.ADDI      sp@size       // deallocate caller stack frame
              B           lp            // return

callee (non-leaf):
    callee:   L.I.64.A    dp=dp,off     // load dp with data pointer
              S.I.64.A    sp,dp,off
              L.I.64.A    sp=dp,off     // new stack pointer
              S.I.64.A    lp,sp,off
              S.I.64.A    dp,sp,off
              ...(using dp)
              L.I.64.A    dp=sp,off
              ...(code using dp)
              L.I.64.A    lp=sp,off     // restore original lp register
              L.I.64.A    sp=sp,off     // restore original sp register
              B.DOWN      lp

callee (leaf, no stack):
    callee:   ...(using dp)
              B.DOWN

FIG. 21B
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2160 



Operation codes

B.GATE    Branch gateway

Equivalencies

B.GATE    B.GATE0

Format

B.GATE rb
bgate(rb)

 31     24 23   18 17   12 11    6 5      0
| B.MINOR |   0   |   1   |  rb   | B.GATE |
     8        6       6       6       6

FIG. 21C
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Branch gateway 



FIG. 21 D 
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Definition 

def BranchGateway(rd,rc,rb) as
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 64)
    if (rd ≠ 0) or (rc ≠ 1) then
        raise ReservedInstruction
    endif
    if c_{2..0} ≠ 0 then
        raise AccessDisallowedByVirtualAddress
    endif
    d ← ProgramCounter_{63..2}+1 || PrivilegeLevel
    if PrivilegeLevel < b_{1..0} then
        m ← LoadMemoryG(c, c, 64, L)
        if b ≠ m then
            raise GatewayDisallowed
        endif
        PrivilegeLevel ← b_{1..0}
    endif
    ProgramCounter ← b_{63..2} || 0^2
    RegWrite(rd, 64, d)
    raise TakenBranch
enddef
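
The gateway check in the definition above can be traced with a small behavioral model. This is only a sketch under stated assumptions: the function name `branch_gateway`, the exception classes, and the dictionary standing in for `LoadMemoryG` are hypothetical, and translation-buffer and tag checks are left out entirely.

```python
MASK64 = (1 << 64) - 1


class AccessDisallowedByVirtualAddress(Exception):
    pass


class GatewayDisallowed(Exception):
    pass


def branch_gateway(pc, privilege, c, b, memory):
    """Model of the BranchGateway definition (hypothetical helper).

    pc, c and b are 64-bit integers; privilege is 0..3; memory is a dict
    mapping an aligned address to the 64-bit gateway word stored there.
    Returns (new_pc, new_privilege, d), where d is the link value.
    """
    if c & 0b111:
        # The gateway address must be aligned to eight bytes.
        raise AccessDisallowedByVirtualAddress(hex(c))
    # d = (ProgramCounter[63..2] + 1) || PrivilegeLevel
    d = ((((pc >> 2) + 1) << 2) | privilege) & MASK64
    target_privilege = b & 0b11
    if privilege < target_privilege:
        # Raising privilege is allowed only if b matches the stored gateway.
        m = memory[c]  # stands in for LoadMemoryG(c, c, 64, L)
        if b != m:
            raise GatewayDisallowed(hex(b))
        privilege = target_privilege
    new_pc = b & ~0b11 & MASK64  # b[63..2] || 00
    return new_pc, privilege, d


gateways = {0x1000: 0x4003}  # entry point 0x4000 at privilege level 3
print(branch_gateway(0x2000, 0, 0x1000, 0x4003, gateways))
```

In this model a caller at privilege 0 reaches privilege 3 only when the target it presents in `b` equals the word stored at the gateway address, which is the property the `GatewayDisallowed` exception protects.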
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Apr. 20, 2004 
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FIG. 21 E 
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2199 



Exceptions 

Reserved instruction 
Gateway disallowed 
Access disallowed by virtual address 
Access disallowed by tag 
Access disallowed by global TB 
Access disallowed by local TB 
Access detail required by tag 
Access detail required by local TB 
Access detail required by global TB 
Local TB miss 
Global TB miss 



FIG. 21F 
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^2210 



Operation codes

E.SCAL.ADD.F.16    Ensemble scale add floating-point half
E.SCAL.ADD.F.32    Ensemble scale add floating-point single
E.SCAL.ADD.F.64    Ensemble scale add floating-point double

Selection

class        op              prec
scale add    E.SCAL.ADD.F    16 32 64

Format

E.op.prec ra=rd,rc,rb

ra=eopprec(rd,rc,rb)

 31       24 23  18 17  12 11   6 5    0
| E.op.prec |  rd  |  rc  |  rb  |  ra  |
      8        6      6      6      6

FIG. 22A
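
The 8/6/6/6/6 field layout used by this format can be illustrated with a short packing routine. This is a sketch only: `pack_fields` and `unpack_fields` are hypothetical helper names, and the opcode value 0xAB in the usage line is made up for illustration, since the figures do not give numeric opcode assignments.

```python
FIELD_WIDTHS = (8, 6, 6, 6, 6)  # major opcode, then four 6-bit fields


def pack_fields(*fields):
    """Pack five field values, most significant first, into a 32-bit word."""
    if len(fields) != len(FIELD_WIDTHS):
        raise ValueError("expected five fields")
    word = 0
    for value, width in zip(fields, FIELD_WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_fields(word):
    """Split a 32-bit word back into its five fields."""
    fields = []
    for width in reversed(FIELD_WIDTHS):
        fields.append(word & ((1 << width) - 1))
        word >>= width
    return tuple(reversed(fields))


# Hypothetical opcode 0xAB with rd=1, rc=2, rb=3, ra=4.
word = pack_fields(0xAB, 1, 2, 3, 4)
print(hex(word), unpack_fields(word))
```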
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dBf BJsemblaRoaflngPdntTemafy(op,pfBC4d,ic,ib,ra) as 
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 128)
    b ← RegRead(rb, 128)
    for i ← 0 to 128-prec by prec
        di ← F(prec, d_{i+prec-1..i})
        ci ← F(prec, c_{i+prec-1..i})
        ai ← fadd(fmul(di, F(prec, b_{prec-1..0})), fmul(ci, F(prec, b_{2*prec-1..prec})))
        a_{i+prec-1..i} ← PackF(prec, ai, none)
    endfor
    RegWrite(ra, 128, a)
enddef



FIG. 22B 
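
The lane-by-lane behavior of the scale-add definition can be sketched in Python. The helper names below are hypothetical, and the arithmetic is an approximation: Python computes each lane in double precision and rounds once when packing, which may differ from the exact `fmul`/`fadd` behavior of the definition for half and single precision. Values too large for the target format raise `OverflowError` instead of producing an infinity.

```python
import struct

_FORMATS = {16: "<e", 32: "<f", 64: "<d"}  # half, single, double


def unpack_lane(value, prec, lo):
    """Read the prec-bit float whose lowest bit is at position lo."""
    raw = (value >> lo) & ((1 << prec) - 1)
    return struct.unpack(_FORMATS[prec], raw.to_bytes(prec // 8, "little"))[0]


def pack_lane(x, prec):
    """Encode x as a prec-bit float and return its bit pattern."""
    return int.from_bytes(struct.pack(_FORMATS[prec], x), "little")


def pack_vector(xs, prec):
    """Build a 128-bit value from a list of lane values, lane 0 lowest."""
    return sum(pack_lane(x, prec) << (i * prec) for i, x in enumerate(xs))


def ensemble_scale_add(d, c, b, prec):
    """a[i] = d[i] * b[0] + c[i] * b[1] for every prec-bit lane i."""
    scale_d = unpack_lane(b, prec, 0)      # b_{prec-1..0}
    scale_c = unpack_lane(b, prec, prec)   # b_{2*prec-1..prec}
    a = 0
    for i in range(0, 128, prec):
        di = unpack_lane(d, prec, i)
        ci = unpack_lane(c, prec, i)
        a |= pack_lane(di * scale_d + ci * scale_c, prec) << i
    return a


d = pack_vector([1.0, 2.0, 3.0, 4.0], 32)
c = pack_vector([10.0, 20.0, 30.0, 40.0], 32)
b = pack_vector([2.0, 0.5, 0.0, 0.0], 32)
a = ensemble_scale_add(d, c, b, 32)
print([unpack_lane(a, 32, i * 32) for i in range(4)])
```

Only the two lowest lanes of `b` act as scale factors; the rest of `b` is ignored, as in the definition.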
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Operation codes 

G.BOOLEAN    Group boolean

Selection

operation    function (binary)    function (decimal)
d            11110000             240
c            11001100             204
b            10101010             170
d&c&b        10000000             128
(d&c)|b      11101010             234
d|c|b        11111110             254
d?c:b        11001010             202
d^c^b        10010110             150
~d^c^b       01101001             105
0            00000000             0

Format

G.BOOLEAN rd@trc,trb,f

rd=gbooleani(rd,rc,rb,f)

 31       25 24 23  18 17  12 11   6 5    0
| G.BOOLEAN |ih|  rd  |  rc  |  rb  |  il  |
      7      1    6      6      6      6



FIG. 23A 
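
The function codes in the selection table follow from applying each operation to the three truth-table columns for d, c and b. A minimal sketch (the names `function_code` and `CODES` are illustrative, not from the source):

```python
# Truth-table columns for the three operands, as listed in the table.
D, C, B = 0b11110000, 0b11001100, 0b10101010


def function_code(expr):
    """Evaluate a three-operand boolean expression on the truth-table
    columns and return the resulting 8-bit function code."""
    return expr(D, C, B) & 0xFF


CODES = {
    "d&c&b": function_code(lambda d, c, b: d & c & b),
    "(d&c)|b": function_code(lambda d, c, b: (d & c) | b),
    "d|c|b": function_code(lambda d, c, b: d | c | b),
    "d?c:b": function_code(lambda d, c, b: (d & c) | (~d & b)),
    "d^c^b": function_code(lambda d, c, b: d ^ c ^ b),
    "~d^c^b": function_code(lambda d, c, b: ~d ^ c ^ b),
}

for name, code in CODES.items():
    print(f"{name:10s} {code:08b} {code}")
```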



^ — 2310 

I 
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if f6-f5 then 

    if f2=f1 then
        if f2 then
            rc ← max(trc,trb)
            rb ← min(trc,trb)
        else
            rc ← min(trc,trb)
            rb ← max(trc,trb)
        endif
        ih ← 0
        il ← 0 || f6 || f7 || f4 || f3 || f0
    else
        if f2 then
            rc ← trb
            rb ← trc
        else
            rc ← trc
            rb ← trb
        endif
        ih ← 0
        il ← 1 || f6 || f7 || f4 || f3 || f0
    endif
else
    ih ← 1
    if f6 then
        rc ← trb
        rb ← trc
        il ← f1 || f2 || f7 || f4 || f3 || f0
    else
        rc ← trc
        rb ← trb
        il ← f2 || f1 || f7 || f4 || f3 || f0
    endif
endif



FIG. 23B
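
The encoding rule of FIG. 23B, which squeezes the 8-bit function f into the 7 bits ih and il by optionally swapping or ordering the rc and rb operands, can be sketched as follows. The function name is hypothetical. Two details are read from a poorly legible figure and should be treated as assumptions: the leading 1 of il in the f2≠f1 branch, and the f1 || f2 order in the swapped ih=1 branch.

```python
def encode_group_boolean(f, trc, trb):
    """Return (ih, il, rc, rb) for function code f and register numbers
    trc, trb, following the selection logic of FIG. 23B."""
    f0, f1, f2, f3, f4, f5, f6, f7 = ((f >> k) & 1 for k in range(8))

    def concat(*bits):
        # Concatenate bits, first argument most significant.
        value = 0
        for bit in bits:
            value = (value << 1) | bit
        return value

    if f6 == f5:
        ih = 0
        if f2 == f1:
            # Operand order itself carries the f2=f1 bit.
            rc, rb = (max(trc, trb), min(trc, trb)) if f2 else (min(trc, trb), max(trc, trb))
            il = concat(0, f6, f7, f4, f3, f0)
        else:
            rc, rb = (trb, trc) if f2 else (trc, trb)
            il = concat(1, f6, f7, f4, f3, f0)  # leading 1 is an assumption
    else:
        ih = 1
        if f6:
            rc, rb = trb, trc
            il = concat(f1, f2, f7, f4, f3, f0)  # bit order is an assumption
        else:
            rc, rb = trc, trb
            il = concat(f2, f1, f7, f4, f3, f0)
    return ih, il, rc, rb


print(encode_group_boolean(202, 3, 5))  # d?c:b
print(encode_group_boolean(128, 3, 5))  # d&c&b
```

Swapping the c and b operands exchanges f1 with f2 and f5 with f6 while leaving the other function bits alone, which is why one bit of each pair can be dropped from the encoding.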
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Definition 

def GroupBoolean(ih,rd,rc,rb,il)
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 128)
    b ← RegRead(rb, 128)
    if ih=0 then
        if il_5=0 then
            f ← il_3 || il_4 || il_4 || il_2 || il_1 || (rc>rb)^2 || il_0
        else
            f ← il_3 || il_4 || il_4 || il_2 || il_1 || 0 || 1 || il_0
        endif
    else
        f ← il_3 || 0 || 1 || il_2 || il_1 || il_5 || il_4 || il_0
    endif
    for i ← 0 to 127 by size
        a_i ← f_{d_i || c_i || b_i}
    endfor
    RegWrite(rd, 128, a)
enddef



FIG. 23C 
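
A behavioral sketch of the definition above: the first function rebuilds the 8-bit function f from ih, il and the register numbers, and the second applies f bit by bit, using the three operand bits as an index into f. The function names are hypothetical, and the bit positions follow a reading of a poorly legible figure, so treat them as assumptions.

```python
def decode_function(ih, il, rc, rb):
    """Rebuild the 8-bit function code from the instruction fields."""
    def bit(k):
        return (il >> k) & 1

    if ih == 0:
        if bit(5) == 0:
            g = 1 if rc > rb else 0  # (rc>rb) replicated twice
            bits = (bit(3), bit(4), bit(4), bit(2), bit(1), g, g, bit(0))
        else:
            bits = (bit(3), bit(4), bit(4), bit(2), bit(1), 0, 1, bit(0))
    else:
        bits = (bit(3), 0, 1, bit(2), bit(1), bit(5), bit(4), bit(0))
    f = 0
    for b in bits:  # first element is f7
        f = (f << 1) | b
    return f


def group_boolean(f, d, c, b, width=128):
    """a_i = f[d_i || c_i || b_i] for each of the width bit positions."""
    a = 0
    for i in range(width):
        index = (((d >> i) & 1) << 2) | (((c >> i) & 1) << 1) | ((b >> i) & 1)
        a |= ((f >> index) & 1) << i
    return a


# With the truth-table columns as operands, the result is f itself.
print(group_boolean(202, 0b11110000, 0b11001100, 0b10101010, width=8))
```

For example, fields ih=1, il=0b101010 decode to 0b10101100, which is the select function d?b:c: the same operation as d?c:b once the c and b operands have been exchanged.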
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Operation codes 

B.HINT    Branch Hint

Format

B.HINT badd,count,rd

bhint(badd,count,rd)

 31     24 23   18 17    12 11    6 5      0
| B.MINOR |  rd   | count |  simm  | B.HINT |
     8        6       6       6        6

simm ← badd-pc-4



FIG. 24A 



